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Abstract 

We present a collection of integral equation methods for the solution to 
the two-dimensional, modified Helmholtz equation, u(x) — q;^Am(x) = 0, 
in bounded or unbounded multiply-connected domains. We consider both 
Dirichlet and Neumann problems. We derive well-conditioned Fredholm in- 
tegral equations of the second kind, which are discretized using high-order, 
hybrid Gauss-trapezoid rules. Our fast multipole-based iterative solution 
procedure requires only 0{N) or O(A^logA^) operations, where N is the 
number of nodes in the discretization of the boundary. We demonstrate the 
performance of the methods on several numerical examples. 

Keywords: Fast multipole method, Gaussian quadrature. Modified 
Helmholtz equation, integral equations, Yukawa potential. 

1. Introduction 

A variety of important problems in science and engineering involve the 
solution to the problem 

M(x)-a2A^(x) = /(x), (1) 

subject to appropriate boundary conditions. This equation is called the 
modified Helmholtz equation. It appears, for example, in the semi-implicit 
temporal discretization of the heat or the Navier-Stokes equations 01 (here 
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o? would be proportional to the time step), and in the linearized Poisson- 
Boltzmann equation. The underlying free-space Green's function is referred 
to as the Yukawa or screened-Coulomb potential. 

In Cheng et al. present a fast direct solver for ([1]) in two dimensions 
on the unit square. The solution is expressed as a volume potential, and 
the direct solver is accelerated using a new version of the fast multipole 
method [g], [loj. The solver is fully adaptive and the computational costs are 
comparable to those of FFT-based methods. Our aim, here, is to complement 
this work. In order to solve ([1]) in more general domains, solutions to 



with prescribed boundary conditions are required. Here, we present integral 
equation methods to solve ([2]) subject to Neumann or Dirichlet boundary con- 
ditions in multiply-connected domains, which may be bounded or unbounded 
in extent. 

Representing ([2]) as an integral equation is a natural choice, and computa- 
tional methods based on an integral equation formulation can have significant 
advantages over conventional finite difference or finite element techniques. 
Integral equation methods are naturally adaptive, easily allow for high-order 
approximations, and can handle arbitrarily complex boundaries. However, 
the discretization of most integral operators yield dense matrices, and in the 
absence of fast algorithms to solve these systems, integral equation methods 
are rarely competitive. 

We present fast-mult ipole accelerated methods for solving integral equa- 
tion representations of This work is meant to join a growing collection of 
fast-multipole-accelerated integral equation methods for linear, elliptic oper- 
ators 0, H, [q], [sj . The fast multipole method (FMM) was first introduced by 
Greengard and Rokhlin as an efficient way to evaluate the Coulomb poten- 
tial due to a collection of charged particles jsj . It has since been extended to 
handle different potentials, in both two and three dimensions. Of particular 
interest here are the FMM methods that have been developed to compute 
volume potentials for the Yukawa potential in two dimensions 0], and parti- 
cle interactions for the three-dimensional Yukawa potential jcj . Our methods 
have the following key elements: 

• Well-conditioned Fredholm integral equations of the second kind are 
formulated. 




(2) 
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The integral equations are discretized with tailored, high-order accurate 
hybrid Gauss-trapezoid rules. 



The resulting linear systems is solved using a GMRES scheme [18 



• The FMM for the two-dimensional Yukawa potential is exploited to 
compute the matrix- vector products in the iterative solution procedure. 

With points in the discretization of the boundary, our methods require 
only 0{N) or O(A^logA^) operations. 

There has been relatively little work done on developing integral equation 
methods for the modified Helmholtz equation. The majority of the work 
appears to be focused on the linearized Poisson-Boltzmann equation, of which 



13l . |14J . |15| are recent examples. Here, the interest is typically to perform an 
electrostatic analysis over a single complicated molecular surface, with the 
electrostatic potential satisfying certain jump conditions across the boundary. 
Our intended application is to couple our work with that presented in Q in 
order to solve the equations that arise from the temporal discretization of the 
heat equation, for example. Thus, we are more concerned with developing a 
general-purpose solver. 

There is a growing body of literature on fast algorithms for boundary inte- 



gral equation methods. Nishimura presents a thorough review in [17|. Recent 



technological advances include using a "kernel-independent" fast-summation 



algorithm, as discussed by Ying et al. in [19|. Their method results in 
an 0{N'^/'^) scheme. Another exciting new development includes fast direct 
solvers jlil, |l6[, which directly construct a compressed factorization of the 
inverse of the matrix arising from the discretization of certain boundary in- 
tegral equations. These methods are particularly attractive to problems in 
which it is of interest to solve the equations with multiple right-hand sides. 

While significant progress has already been made in developing efficient 
integral equation methods for linear elliptic equations, similar treatment of 
time- dependent equations has lagged behind. Also, the application of these 
techniques to practical problems of interest in the broader scientific com- 
munity has been limited. The reason seems clear; the majority of partial 
differential equations that arise from modelling real-world phenomena are 
both time dependent and nonlinear. In |2] , Greengard and Kropinski argued 
that after both discretizing in time and applying a suitable linearization, the 
Navier-Stokes equations can be rendered amenable to treatment by integral 
equation methods. However, in order to realize this, certain "building-block" 
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Figure 1: A bounded multiply- connected domain D. The outer boundary is denoted by Tq 
(which is absent in unbounded domains) and the interior component curves byVi, - ■ ■ ,Tm- 
The unit normal n points out of D on each component curve. 

tools are still needed. Our work is meant to provide one such tool and bring 
us one step closer to making integral equation methods an attractive alter- 
native to conventional finite element and finite difference methods for many 
large-scale problems in science and engineering. 

The paper is organized as follows: In section 2, we discuss the potential 
theory for the modified Helmholtz operator and the corresponding integral 
equation formulations. In section 3, we present our numerical methods, with 
the fast-multipole method being briefiy outlined in section 4. Numerical 
examples are presented in Section 5. 

2. Potential theory and integral equation formulations 

To fix notation, let us consider a domain D with boundary F which is 
M or (M + l)-ply connected (Figure [1]). We are interested in solving ([2]), 
subject to both Dirichlet and Neumann boundary conditions. In both cases, 
we start with the free-space Green's function G{y — x) = G{yi — Xi,y2 — X2) 
for the operator 1 — a^A, 



where Kq is the zeroth-order modified Bessel's function of the second kind. 
The Dirichlet problem is considered in section 2.1 and the Neumann problem 




in 2.2. 
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2.1. The Dirichlet Problem 

Consider the following Dirchlet problem: 

m(x) - a^Au{x.) = 0, X G 

m(x) = /(x), xGT. (3) 

If D is unbounded, the appropriate decay conditions for the solution is 
u(x) —7- as |x| — 7- oo. We seek the solution u(x) in the form of a dou- 
ble layer potential, 

where cr(y) is the value of the unknown density at the boundary point y, 
and d/dny represents the outward normal derivative at the point y. 

In order to derive the form of the integral equation based on the rep- 
resentation dl]), we must understand the behaviour of the potential on the 
boundary T. We note that as 2; — )■ 0, 

Ko{z)r^-logiz)+Q{z), 

where Q{z) is a polynomial in ^ [ij. Thus, (j4]) behaves asymptotically as if 
it has a logarithmic kernel, and we note for future reference that 

lim f = -^«:(x), (5) 

y-+x (juy \ a J 2 

x,y e r 

where k{'x) denotes the curvature of F at the point x. 

The jump relations for potentials of the logarithmic kernel are well known, 
and therefore, for any point x on the boundary F, 

X.' E D 

Using the first limit in the preceding expression, and matching to the bound- 
ary condition (|3]), we obtain an integral equation for the layer density a: 
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2.2. The Neumann Problem 

Consider the following Neumann problem: 

m(x) - a^AM(x) = 0, X G -D, 
We seek the solution m(x) in the form of a single layer potential 

^x) = J-,|^i^o(^) -{y)ds,. 

In order to satisfy the boundary conditions, the density cr(x) must satisfy 
the integral equation 

We note that from ([5]), the kernels in the integral equations (E]) and 
are bounded and continuous along F. In addition, there are no nontrivial 
homogeneous solutions. Therefore, by the Fredholm alternative, ([6]) and ([7]) 
have unique solutions for any integrable data /(x) or 5'(x). 

In summary, equations and ([7]) can be written in the general form: 

a(x) + - / K{y, x)a(y) dsy = F(x). (8) 

In the case of the Dirichlet problem, we have 

a \ a / |y — x| 
F(x) = -2aV(x). 

In the case of the Neumann problem, we have 

A-(y.x) = iA-,(^)^.„,. 

« V / |y ^ x| 

F(x) = 2a2^(x). 
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3. Numerical methods 



We assume each component curve Tk, k = 1, ■ ■ ■ , M, is parametrized by 
y'^(Q;), where a G [0,27r). Similarly, a^{a) refers to the restriction of the 
density cr on F^. On each contour r^, we are given N points equispaced with 
respect to a. Thus the mesh spacing is h = 27r/N, and the total number of 
discretization points is NM. Associated with each such point, denoted by 
y^, is an unknown density a^. The derivative dy /da is denoted by w'^, and 
the derivative values w*^ are obtained pseudospectrally. 

The presence of the logarithmic singularity in the integral operator causes 
the spectrum of kernels based on the Yukawa potential to decay slowly. Ap- 
plying the straightforward Nystrom discretization based on the trapezoidal 
rule would result in a significant loss of accuracy. Instead, we use hybrid 
gauss-trapezoidal quadrature rules developed by Alpert [2,] which are tai- 
lored for integrands with logarithmic singularities. These quadratures are 
of order h^logh. The order p determines a set of nodes Un and weights 
n = 1 ■ ■ ■ /. These nodes and weights are used for the quadrature within the 
interval a G [aj — ha,aj + ha], on (a and / are also determined by p). 
Outside of this interval, the trapezoid rule is used. Applying this quadrature 
to dH]) yields 



In the second sum, we invoke periodicity of all functions on F^, or equiva- 
lently, j + = j- In the final sum, we are required to know values of a 
intermediate to the nodal values at a = aj ± hv\n\- In these cases, we use 
Fourier interpolation. 

Equation dH]) is a dense MN x MN linear system that must be solved 
for the unknowns cr^. In our implementation, (|9]) is solved iteratively, using 
GMRES [18]. The bulk of the work at each iteration lies in applying the linear 
system to a vector. Done directly, this would require O(A^^M^) operations 
for each iteration. This cost can be reduced to 0{NM) by using the adaptive 
fast multipole method, the details of which are outlined in the next section. 





(9) 



n — —I 

n ^ 
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Since the number of iterations needed to solve a Fredholm equation of the 
second kind to a fixed precisionis bounded independent of the system size 

(holding the geometry fixed), the total number of operations required to 
solve © is 0{NM). 



4. The fast multipole method 

Consider a collection of N particles or "sources" in R^, xi, X2, x^v, to- 
gether with corresponding source strengths gi, g2, ■ ■ ■ , (In- We are interested 
in efficiently evaluating the following: 

^ /I - l\ 

$(x,)= g,iro(^^^^^J, J = l,---,iV. (10) 



i = 1 
i # 3 



The fast multipole method was developed to evaluate fields based on 



the Coulomb potential in two dimensions [3|, Ijjj. The FMM has since been 
extended to include other potential functions, such as the three-dimensional 
Yukawa potential fo!]. An FMM developed to evaluate volume potentials 
based on the two-dimensional Yukawa potential appears in [3]; we use this 
work as the basis for our "particle-to-particle" FMM required to evaluate ([9]). 
Our FMM is identical in structure to the one presented in with only minor 
modification. For more detail, we refer the reader to this work and others 
@, 10 1 which discusses the use of exponential expansions. Here, we present a 
minimal sketch of the FMM algorithm as it applies to our problem at hand. 

The FMM uses an adaptive quad-tree structure in order to superimpose 
a hierarchy of refinements on the computational domain. We imbed the 
geometry inside a unit square 5, which is considered to be grid level 0. Grid 
level Z + 1 is obtained recursively by subdividing each square s at level / into 
four equal parts, which are the "children" of s. Adaptivity is achieved by not 
requiring the same number of levels of subdivision in all regions of S. The 
basic idea of the FMM is that for each particle, contributions from "nearby" 
(neighbour) particles to the potential field are handled directly via (ITOll . while 
"far- field" (non-neighbour) interactions are handled using multipole and/or 
related expansions. 

The first step in the FMM is to form the multipole expansions for all of 
the nodes (boxes) in the quad-tree structure. The following theorem follows 
from Graf's addition theorem, and corresponds to Theorem 3 in j3] with 
modifications made for particle to particle interactions: 
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Theorem 4.1 (Multipole Expansion). Let s be a node in the quad-tree 
centred at s = (si, S2). Assume s is not a neighbour of B. Then the potential 
$(x) due to s for :x. G B is given by the multipole expansion 



Xi6B Z=— 00 



V c R V 



Xi - s 



q; 



i/ere, Ii and Ki are the order modified BesseVs function of the first and 
second kind, respectively, and o.nd denote the polar angles for x and 
Xj with respect to s. 

The second step of the FMM is to construct a local expansion for each 
node B. In order to translate the multipole expansions to local ones, we 
apply the following result: 

Theorem 4.2 (Local Expansion). Suppose a multipole expansion associ- 
ated with node s centred at s is given by 



00 / 1 _ I \ 



Further assume s is in the 'far field" of B. Then for any x & B, $(x) can 
be represented as a local expansion 



00 

4.(x) = J: L,I, (9 



/=— 00 



where {p,0) denote the polar coordinates ofx with respect to the centre of B, 
and the local coefficients are expressed using the ^'multipole to local" transla- 
tion operator defined by 



g -i{l-n)9o 



n=— 00 

Here, {pq,9q) are the polar coordinates of the centre of B with respect to s. 
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The work required in the above Theorem represents the bulk of the op- 
eration count in the FMM. This work can be significantly reduced by using 
plane wave representations and a "diagonal translation operator" (c.f. Ap- 
pendices A and C in Q). The savings in cost are shown in table [Tl The rest 
of the original FMM procedure proceeds as in sections 2 and 3 in , and we 
will not elaborate further. 





Direct 


Original FMM 


Plane- Wave FMM 


64 


0.01 


0.06 


0.06 


128 


0.03 


0.08 


0.08 


256 


0.08 


0.22 


0.12 


512 


0.30 


0.47 


0.21 


1024 


1.15 


1.03 


0.36 


2048 


4.59 


2.02 


0.60 


4096 


18.26 


3.86 


1.36 


8192 


73.42 


9.46 


2.02 


16384 


292.52 


14.80 


4.60 


32768 


1168.22 


38.44 


7.04 


65536 


4689.96 


56.55 


17.42 



Table 1: A comparison of the CPU time (in seconds) required to compute the potential 
due to a set a points directly, via the original FMM based on multipole expansions, or via 
the FMM based on plane- wave expansions. Half of the points are randomly placed in the 
unit square and the other half are concentrated on two elliptical boundaries. 



Evaluating (jH]), however, is not as simple as calculating a sum in the form 
of f lTU]) . In the case of the Neumann problem, we are required to evaluate a 
potential of the form 



This evaluation is relatively straightforward as Vxj ^'(xj) relies on the same 
multipole coefficients as $(xj). In the case of the Dirichlet problem, we are 
required to evaluate a potential of the form 

, . „ _ /|x 
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In this case, the multipole coefficients do change: 




Figure 2: The solution to Example 1. 



5. Numerical Results 

The algorithms described above have been implemented in Fortran. The 
tolerance for convergence of GMRES is set to 10~^^. Here, we illustrate the 
performance on a variety of examples. All timings cited are for a Mac Pro 
2.1 with two 3GHz Quad-Core Intel Xeon processors. 

Example 1: We ffist consider the problem of solving ([2]) with a = 0.1 in 
a bounded, circular domain with ten interior elliptic contours (see figure |2]). 
We generate the Dirichlet boundary conditions from 

10 /I _ |\ 

-w = E^o , (11) 

k=l ^ ^ 

where is a point inside Tk- We test the performance of our methods using 
quadrature rules of varying degree of accuracy. The results are shown in 
table |2] through table [61 These tables conffim that the number of GMRES 
iterations required for convergence is independent of A^. Also, we see near- 
linear scaling of the CPU time with A^. In terms of overall accuracy, we see 
the best overall performance with the 0{h^logh) scheme. 
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N ^ Iterations CPU Time Error 

^04 45 12l 3.8010-*^ 

1408 45 20.3 3.3610-^ 

2816 45 40.1 4.20 10-s 

5632 45 83.9 5.25 10"^ 



Table 2: Performance of the algorithm on example 1. Trapezoid Rule. 



N # Iterations CPU Time Error 

^04 45 12^3 9.95 10"^ 

1408 45 20.2 3.6010"^ 

2816 45 41.1 5.71 10-s 

5632 45 87.7 8.65 10"^ 



Table 3: Performance of the algorithm on example 1. Quadrature is 0{h^ log/i). 



^ Iterations CPU Time Error 

loi 45 r2^5 1.11 10"' 

1408 45 21.1 4.9210" 

2816 45 42.3 3.1910" 

5632 45 88.9 2.2610" 



Table 4: Performance of the algorithm on example 1. Quadrature is 0(/i^ log /i). 



N # Iterations CPU Time Error 

^04 45 l3^3 1.1110"^ 

1408 45 22.5 5.4110"^' 

2816 45 45.0 5.88 10"^' 

5632 45 94.2 8.5510"^' 



Table 5: Performance of the algorithm on example 1. Quadrature is 0{h^logh). 
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N 


^ Iterations 


CPU Time 


Error 


704 


45 


14.7 


1.1110"^ 


1408 


45 


25.2 


5.5010-1° 


2816 


45 


50.2 


1.4010-11 


5632 


45 


104.5 


1.4410-11 



Table 6: Performance of the algorithm on example 1. Quadrature is 0{h^^ ^ogh). 

Example 2: We now consider an exterior Neumann problem, whose bound- 
ary conditions are, again, derived from f llip . The geometry is also the same 
as in the preceding example, minus the outer boundary. Based on previ- 
ous results, we select the 0{h^\ogh) quadrature rule, and we examine the 
performance of the methods for different values of a in ([2]). The results are 
shown in table [7] through table [lOl Interestingly enough, the conditioning of 
the linear system appears relatively independent of a for a > 0.1. This is 
not the case for the Dirichlet problem, where we observed that the number 
of GMRES iterations visibly increases as a increases in size. This is to be 
expected, as the governing equation becomes more harmonic, and the inte- 
gral equation based on the double layer potential for Laplace's equation in 
multiply connected domains is known to become rank deficient Since we 
are anticipating applications in which a is proportional to a time step, we 
are not concerned here with this behaviour. 





^ Iterations 


CPU Time 


Error 


704 


23 


4.3 


1.9910-6 


1408 


23 


6.8 


2.0810-11 


2816 


23 


13.5 


1.7110-13 


5632 


23 


26.7 


1.8110-13 



Table 7: Performance of the algorithm on example 2 with a = 10. 

Example 3: We next consider a larger-scale problem. We compute the solu- 
tion to the modified Helmholtz equation with a = 0.1 in a bounded domain 
with 100 elliptical contours that have varying proportions and alignment (fig- 
ure [3l On each of these contours, we prescribe a constant Dirichlet boundary 
conditions selected randomly from (—1,1). Each contour is discretized with 
512 points, resulting in a matrix of order 51712. The total CPU time to solve 
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N # Iterations CPU Time Error 

1m 23 45 1.83 lO-*^ 

1408 23 6.9 1.8710"^ 

2816 23 13.7 4.37 10"^' 

5632 23 26.7 6.33 lO"^' 



Table 8: Performance of the algorithm on example 2 with a = 1. 



N # Iterations CPU Time Error 

~704: 25 6^5 5.4310"^ 

1408 25 8.8 6.91 10-1' 

2816 25 16.4 4.67 lO'^' 

5632 25 31.6 4.21 lO'^' 



Table 9: Performance of the algorithm on example 2 with a = 0.1. 



N ^ Iterations CPU Time Error 

^04 15 3^6 6.3310- 

1408 15 6.9 6.9110- 

2816 15 14.5 2.7710- 

5632 15 27.0 6.4510- 



Table 10: Performance of the algorithm on example 2 with a = 0.01. 
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Figure 3: The solution to example 3. There are 512 points per hole, N = 51712, 88 
GMRES iterations are required, consuming 1510.7 s of CPU time. The plot on the right 
shows a close-up of the solution from a small region from the upper left quadrant of the 
domain. 

this linear system to a tolerance of 10~^^ required approximately 25 minutes. 
Example 4: While the domains in the preceding examples are multiply 
connected, the individual contours have consisted of rather simple shapes. 
In this example, we consider more complex contours (figure H]). We solve 
equation ([2]) in an unbounded domain with a = 0.1. Dirichlet boundary 
conditions are assigned in the following way: we assign a value of 1 on the 
wheels of the truck, a value of 2 to the rest of the truck and a value of 3 on 
the person. 

6. Conclusions 

We have presented a class of integral equation methods for the solu- 
tion to the modified Helmholtz equation in bounded or unbounded multiply- 
connected domains. Using a fast-multipole accelerated iterative method, our 
solution procedure requires 0{N) operations, where is the number of nodes 
in the discretization of the boundary. With these techniques, and using only 
modest computational resources, we were able to accurately and efficiently 
solve large-scale problems in complex domains. 

The immediate focus of future work is to develop integral equation tech- 
niques to solve ([1]) in arbitrary geometry. To do this, we must couple our 
solver with the solver presented in Once that is achieved, we will apply 
these integral equation techniques to solve the equations that arise from the 
temporal discretization of convection-diffusion type equations. In this way. 



15 



0.5 

lo 



Figure 4: The solution to example 4 with N— 4-740. 4^ GMRES iterations are required, 
requiring 36.4 s of CPU time. 

we hope to demonstrate that integral equation methods offer an attractive 
alternative to conventional finite element and finite difference methods for 
many large-scale problems from engineering and physics. 
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